Cost-Effective HITs for Relative Similarity Comparisons

Author: Belongie Serge J.
Kwak Iljung S.
Wilber Michael J.
Publication date: 12/04/2014

Similarity comparisons of the form "Is object a more similar to b than to c?" are useful for computer vision and machine learning applications. Unfortunately, an embedding of n points is specified by n^3 triplets, making collecting every triplet an expensive task. Noticing this difficulty, other researchers have investigated more intelligent triplet sampling techniques, but they do not study their effectiveness or their potential drawbacks. Although it is important to reduce the number of collected triplets, it is also important to understand how best to display a triplet collection task to a user. In this work we explore an alternative display for collecting triplets and analyze the monetary cost and speed of the display. We propose best practices for creating cost-effective human intelligence tasks for collecting triplets. We show that rather than changing the sampling algorithm, simple changes to the crowdsourcing UI can lead to much higher quality embeddings. We also provide a dataset as well as the labels collected from crowd workers. Comment: 7 pages, 7 figures
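To give a feel for why collecting every triplet is expensive, the following sketch counts triplets for a given n. It is only an illustration: the function names are hypothetical, and the second count (distinct objects, with "b vs. c" and "c vs. b" treated as the same question) is an assumption added here, not a figure from the abstract.

```python
def triplets_upper_bound(n: int) -> int:
    """Number of (a, b, c) triples when each slot can hold any of n objects."""
    return n ** 3


def distinct_triplets(n: int) -> int:
    """Triples with three distinct objects, where swapping b and c
    asks the same question (assumption made for this sketch)."""
    return n * (n - 1) * (n - 2) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, triplets_upper_bound(n), distinct_triplets(n))
```

Both counts grow cubically, so even a modest collection of objects implies far more questions than can reasonably be put to crowd workers.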

arXiv.org e-Print Archive

CiteSeerX

Overcomplete steerable pyramid filters and rotation invariance

Author: Anderson C. H.
Belongie S.
Goodman R.
Greenspan H.
Perona P.
Rakshit S.
Publication venue: IEEE Computer Society Press
Publication date: 01/01/1994

A given (overcomplete) discrete oriented pyramid may be converted into a steerable pyramid by interpolation. We present a technique for deriving the optimal interpolation functions (otherwise called 'steering coefficients'). The proposed scheme is demonstrated on a computationally efficient oriented pyramid, which is a variation on the Burt and Adelson (1983) pyramid. We apply the generated steerable pyramid to orientation-invariant texture analysis in order to demonstrate its excellent rotational isotropy. High classification rates and precise rotation identification are demonstrated

CiteSeerX

Crossref

Caltech Authors

Enhanced decoding for the Galileo S-band mission

Author: Belongie M.
Dolinar S.

A coding system under consideration for the Galileo S-band low-gain antenna mission is a concatenated system using a variable redundancy Reed-Solomon outer code and a (14,1/4) convolutional inner code. The 8-bit Reed-Solomon symbols are interleaved to depth 8, and the eight 255-symbol codewords in each interleaved block have redundancies 64, 20, 20, 20, 64, 20, 20, and 20, respectively (or equivalently, the codewords have 191, 235, 235, 235, 191, 235, 235, and 235 8-bit information symbols, respectively). This concatenated code is to be decoded by an enhanced decoder that utilizes a maximum likelihood (Viterbi) convolutional decoder; a Reed-Solomon decoder capable of processing erasures; an algorithm for declaring erasures in undecoded codewords based on known erroneous symbols in neighboring decodable words; a second Viterbi decoding operation (redecoding) constrained to follow only paths consistent with the known symbols from previously decodable Reed-Solomon codewords; and a second Reed-Solomon decoding operation using the output from the Viterbi redecoder and additional erasure declarations to the extent possible. It is estimated that this code and decoder can achieve a decoded bit error rate of 1 x 10^-7 at a concatenated code signal-to-noise ratio of 0.76 dB. By comparison, a threshold of 1.17 dB is required for a baseline coding system consisting of the same (14,1/4) convolutional code, a (255,223) Reed-Solomon code with constant redundancy 32 also interleaved to depth 8, a one-pass Viterbi decoder, and a Reed-Solomon decoder incapable of declaring or utilizing erasures. The relative gain of the enhanced system is thus 0.41 dB. It is predicted from analysis based on an assumption of infinite interleaving that the coding gain could be further improved by approximately 0.2 dB if four stages of Viterbi decoding and four levels of Reed-Solomon redundancy are permitted. Confirmation of this effect and specification of the optimum four-level redundancy profile for depth-8 interleaving are currently under way.
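The arithmetic in this abstract can be checked with a short sketch. The redundancy profile, codeword length, and dB thresholds are taken from the abstract. The function and variable names are hypothetical, and the outer-code rate computed here is an illustrative derived quantity that the abstract does not state.

```python
CODEWORD_LEN = 255  # symbols per Reed-Solomon codeword (from the abstract)
ENHANCED_PROFILE = [64, 20, 20, 20, 64, 20, 20, 20]  # redundancies, depth-8 block
BASELINE_PROFILE = [32] * 8  # (255,223) code, constant redundancy 32


def info_symbols(redundancies, n=CODEWORD_LEN):
    """Information symbols per codeword: length minus redundancy."""
    return [n - r for r in redundancies]


def outer_code_rate(redundancies, n=CODEWORD_LEN):
    """Fraction of symbols in an interleaved block that carry information."""
    return sum(info_symbols(redundancies, n)) / (n * len(redundancies))


enhanced_rate = outer_code_rate(ENHANCED_PROFILE)
baseline_rate = outer_code_rate(BASELINE_PROFILE)

# Relative gain: baseline threshold minus enhanced threshold, in dB.
gain_db = round(1.17 - 0.76, 2)

if __name__ == "__main__":
    print(info_symbols(ENHANCED_PROFILE))
    print(enhanced_rate, baseline_rate, gain_db)
```

Under these assumptions the variable-redundancy block spends 248 redundant symbols per interleaved block against 256 for the baseline, so the reported gain does not come from adding redundancy overall.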

NASA Technical Reports Server

Enhanced decoding for the Galileo low-gain antenna mission: Viterbi redecoding with four decoding stages

Author: Belongie M.
Dolinar S.

The Galileo low-gain antenna mission will be supported by a coding system that uses a (14,1/4) inner convolutional code concatenated with Reed-Solomon codes of four different redundancies. Decoding for this code is designed to proceed in four distinct stages of Viterbi decoding followed by Reed-Solomon decoding. In each successive stage, the Reed-Solomon decoder only tries to decode the highest redundancy codewords not yet decoded in previous stages, and the Viterbi decoder redecodes its data utilizing the known symbols from previously decoded Reed-Solomon codewords. A previous article analyzed a two-stage decoding option that was not selected by Galileo. The present article analyzes the four-stage decoding scheme and derives the near-optimum set of redundancies selected for use by Galileo. The performance improvements relative to one- and two-stage decoding systems are evaluated
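The staged ordering described above (highest-redundancy codewords first, one redundancy level per stage) can be sketched as follows. This is only an outline of the scheduling idea: the function name is hypothetical, the profile values are made up for illustration and are not the redundancies selected by Galileo, and no actual Viterbi or Reed-Solomon decoding is performed.

```python
from collections import defaultdict


def decoding_schedule(redundancies):
    """Group codeword indices by redundancy level, highest level first.

    Each inner list is the set of codewords a stage would attempt,
    after the Viterbi decoder has redecoded using symbols already known.
    """
    by_level = defaultdict(list)
    for index, redundancy in enumerate(redundancies):
        by_level[redundancy].append(index)
    return [by_level[level] for level in sorted(by_level, reverse=True)]


# Hypothetical four-level profile for a depth-8 interleaved block
# (illustrative values only, not taken from the article).
example_profile = [40, 10, 20, 10, 30, 10, 20, 10]
schedule = decoding_schedule(example_profile)

if __name__ == "__main__":
    for stage, codewords in enumerate(schedule, start=1):
        print(f"stage {stage}: codewords {codewords}")
```

With four distinct redundancy levels the schedule has four stages, matching the four-stage structure the abstract describes.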

NASA Technical Reports Server

Many-to-Many Graph Matching: a Continuous Relaxation Approach

Author: H.A. Almohamad
H.W. Kuhn
M. Carcassoni
M. Neuhaus
M. Zaslavskiy
S. Belongie
S. Umeyama
T. Caelli
Y. Nesterov
Publication date: 01/11/2009

Graphs provide an efficient tool for object representation in various computer vision applications. Once graph-based representations are constructed, an important question is how to compare graphs. This problem is often formulated as a graph matching problem where one seeks a mapping between vertices of two graphs which optimally aligns their structure. In the classical formulation of graph matching, only one-to-one correspondences between vertices are considered. However, in many applications, graphs cannot be matched perfectly and it is more interesting to consider many-to-many correspondences where clusters of vertices in one graph are matched to clusters of vertices in the other graph. In this paper, we formulate the many-to-many graph matching problem as a discrete optimization problem and propose an approximate algorithm based on a continuous relaxation of the combinatorial problem. We compare our method with other existing methods on several benchmark computer vision datasets. Comment: 1

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL-MINES ParisTech

A robust braille recognition system

Author: B.C. Lin
C. Arcelli
F. Ghorbel
J. Llado
J.-Y. Ramel
L. Yan
L.P. Cordella
S. Belongie
S. Tabbone
S.O. Belkasim
S.R. Deans
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

Braille is the most effective means of written communication between visually-impaired and sighted people. This paper describes a new system that recognizes Braille characters in scanned Braille document pages. Unlike most other approaches, an inexpensive flatbed scanner is used and the system requires minimal interaction with the user. A unique feature of this system is the use of context at different levels (from the pre-processing of the image through to the post-processing of the recognition results) to enhance robustness and, consequently, recognition results. Braille dots composing characters are identified on both single and double-sided documents of average quality with over 99% accuracy, while Braille characters are also correctly recognised in over 99% of documents of average quality (in both single and double-sided documents)

University of Salford Institutional Repository

Crossref

INRIA a CCSD electronic archive server

Word matching using single closed contours for indexing handwritten historical documents

Author: Alan F. Smeaton
C.C. Teppert
D. Cheng
F. Mokhtarian
L. Vincent
L.K. Huang
Noel E. O’Connor
R.F. Farag
S. Belongie
S. Madhvanath
Tomasz Adamek
W. Niblack
Publication venue: Springer Science and Business Media LLC
Publication date: 01/04/2007

Effective indexing is crucial for providing convenient access to scanned versions of large collections of historically valuable handwritten manuscripts. Since traditional handwriting recognizers based on optical character recognition (OCR) do not perform well on historical documents, a holistic word recognition approach has recently gained popularity as an attractive and more straightforward solution (Lavrenko et al. in Proc. Document Image Analysis for Libraries (DIAL’04), pp. 278–287, 2004). Such techniques attempt to recognize words based on scalar and profile-based features extracted from whole word images. In this paper, we propose a new approach to holistic word recognition for historical handwritten manuscripts based on matching word contours instead of whole images or word profiles. The new method consists of robust extraction of closed word contours and the application of an elastic contour matching technique proposed originally for general shapes (Adamek and O’Connor in IEEE Trans Circuits Syst Video Technol 5:2004). We demonstrate that multiscale contour-based descriptors can effectively capture intrinsic word features, avoiding any segmentation of words into smaller subunits. Our experiments show a recognition accuracy of 83%, which considerably exceeds the performance of other systems reported in the literature.

Crossref

Irish Universities

DCU Online Research Access Service

Real-Time Hand Shape Classification

Author: A. Erol
D. Huttenlocher
J. Nalepa
J. Wachs
M. Kawulok
M. Papiez
M.K. Hu
P. Phillips
S. Belongie
T. Grzejszczak
Y. Shen
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

The problem of hand shape classification is challenging since a hand is characterized by a large number of degrees of freedom. Numerous shape descriptors have been proposed and applied over the years to estimate and classify hand poses in reasonable time. In this paper we discuss our parallel framework for real-time hand shape classification. We show how the number of gallery images influences the classification accuracy and execution time of the parallel algorithm. We present the speedup and efficiency analyses that prove the efficacy of the parallel implementation. Notably, different methods can be used at each step of our parallel framework. Here, we combine the shape contexts with the appearance-based techniques to enhance the robustness of the algorithm and to increase the classification score. An extensive experimental study proves the superiority of the proposed approach over existing state-of-the-art methods. Comment: 11 pages

arXiv.org e-Print Archive

Crossref

Face analysis using curve edge maps

Author: D. Cristinacce
F. Deboeverie
J. Canny
K.J. Karande
L. Ding
L. Liang
O. Jesorsky
P. Veelaert
S. Belongie
S.-H. Cha
T.F. Cootes
X. Liu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

This paper proposes an automatic and real-time system for face analysis, usable in visual communication applications. In this approach, faces are represented with Curve Edge Maps, which are collections of polynomial segments with a convex region. The segments are extracted from edge pixels using an adaptive incremental linear-time fitting algorithm, which is based on constructive polynomial fitting. The face analysis system considers face tracking, face recognition and facial feature detection, using Curve Edge Maps driven by histograms of intensities and histograms of relative positions. When applied to different face databases and video sequences, the average face recognition rate is 95.51%, the average facial feature detection rate is 91.92% and the accuracy in location of the facial features is 2.18% in terms of the size of the face, which is comparable with or better than the results in the literature. However, our method has the advantages of simplicity, real-time performance and extensibility to the different aspects of face analysis, such as recognition of facial expressions and talking.

Crossref

Ghent University Academic Bibliography

Topological descriptors for 3D surface analysis

Author: A Othmani
AE Johnson
C Seiffert
DG Lowe
H Edelsbrunner
HJ Poincaré
J Wohlfeil
M Juda
RM Haralick
S Belongie
T Ojala
U Bauer
V López
Y Freund
Z Guo
Publication date: 01/01/2016

We investigate topological descriptors for 3D surface analysis, i.e. the classification of surfaces according to their geometric fine structure. On a dataset of high-resolution 3D surface reconstructions we compute persistence diagrams for a 2D cubical filtration. In the next step we investigate different topological descriptors and measure their ability to discriminate structurally different 3D surface patches. We evaluate their sensitivity to different parameters and compare the performance of the resulting topological descriptors to alternative (non-topological) descriptors. We present a comprehensive evaluation that shows that topological descriptors (i) are robust, (ii) yield state-of-the-art performance for the task of 3D surface analysis and (iii) improve classification performance when combined with non-topological descriptors. Comment: 12 pages, 3 figures, CTIC 201

arXiv.org e-Print Archive

Crossref

Jagiellonian University Repository